In 2012 Michigan implemented Great Start to Quality, a voluntary quality rating and 
improvement system (QRIS) that uses a multidimensional assessment system to rate the 
quality of early childhood education programs. Regional Educational Laboratory Midwest 
examined how changes to Great Start to Quality’s rating calculation approach that 
were announced in 2013 affected program quality ratings. Under the revised approach 
approximately a third of programs had a higher simulated QRIS rating, although the 
underlying data and measures of program quality were unchanged. The simpler total 
score approach yielded simulated QRIS ratings that were nearly identical to those under 
the revised approach. Minor changes in the calculation approach can yield substantial 
changes to ratings for individual programs and to the distribution of ratings across the 
state. When changing the calculation approach, QRIS administrators must consider the 
cost implications as well as the tradeoffs between having a simple, transparent approach 
and the benefits of including minimum required scores for different aspects of quality. 


This brief summarizes the findings of Faria, A-M., Hawkinson, L., Greenberg, A., Howard, E., & Brown, 
L. (2015). Examining changes to Michigan's early childhood quality rating and improvement system (QRIS) 
(REL 2015-029). Washington, DC: U.S. Department of Education, Institute of Education Sciences, 
National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory 
Midwest. That report is available at http://ies.ed.gov/ncee/edlabs/projects/project.asp?ProjectID=355. 


Why this study?


As documenting and improving early childhood program quality have become higher national priorities,
quality rating and improvement systems (QRISs) have expanded rapidly (Mitchell, 2005). QRISs are
multidimensional assessment systems created to rate the quality of early childhood education and child
care programs, encourage those settings to provide higher quality experiences for young learners, and
increase the amount of clear and reliable information available to families about these programs.
In the 1990s QRISs grew out of early tiered reimbursement strategies in the child care subsidy
system to promote and reward high-quality care. Thus, they initially focused on licensed, subsidized child 
care centers. In the mid-2000s the focus shifted from using QRISs to rate subsidized programs to using 
QRISs as a policy mechanism to improve early child care quality across multiple program settings. States 
have expanded QRISs to include a diverse range of program types (for example, school-based prekindergarten
programs, state-funded prekindergarten programs, Head Start and Early Head Start programs, family-based
care, home-based care, and afterschool programs) and now make the ratings public to help families
incorporate quality into their child care decisions. Most systems are now voluntary and linked to financial 
incentives or receipt of child care subsidies to promote participation (Tout et al., 2010). 


With attention focused on the potential of high-quality early childhood education to reduce school readiness
gaps (for example, Christenson & Reschly, 2010), all states except Missouri¹ have begun to implement
or plan to implement some form of QRIS (QRIS National Learning Network, 2014). However, states
differ in how they define, measure, and rate child care quality in their QRIS, though some domains of
quality are common across many states’ systems, including
licensing compliance (26 states), staff qualifications (26), environment (24), family partnership (24), administration
and management (23), and accreditation (21; Caronongan, Kirby, Malone, & Boller, 2011; Tout et al.,
2010).² Although consensus is growing among experts regarding which components of program quality are
most closely related to child development (for example, components of process quality, including supportive
teacher-child interactions and use of an evidence-based curriculum; Sabol, Soliday Hong, Pianta, & Burchinal,
2013), consensus is still lacking on how to measure and rate these components in a state-developed QRIS.


In fact, recent studies that simulated QRIS ratings using nationally representative datasets, such as the Early
Childhood Longitudinal Study, Birth Cohort, found that the rating calculation approach can considerably
alter the distribution of quality ratings. These results suggest that even fairly minor changes to a state’s
rating calculation approach can have considerable implications for the distribution of ratings (Tout, Chien,
Rothenberg, & Li, 2014). The distribution of ratings is important because it signals the level of quality of
early care offered across the state, because the ratings are used to inform program funding (for example,
some QRISs link ratings with increased child care subsidy receipt or financial bonuses), and because families
use the ratings to inform their child care decisions.


This study, a collaboration between Regional Educational Laboratory (REL) Midwest and the state of 
Michigan’s Office of Great Start, examined how self-assessment ratings from Great Start to Quality, Michigan’s
QRIS, were awarded under the QRIS’s original rating calculation approach. The study also analyzed
how small changes to the state’s approach to calculating self-assessment ratings affected the distribution of 
simulated self-assessment ratings for individual early childhood education and child care programs across 
the state (box 1). 


The target audience for this report includes state-level QRIS administrators, who can use the findings to better 
understand the implications of changes to the rating calculation approach. Simplifying the approach or relaxing
one of the criteria may have associated costs, for example because of the need to conduct more observations,
to reimburse more programs at higher quality tier rates, or to provide new training about the QRIS.


Box 1. Data and methods 


The study used data on 2,390 Michigan early childhood education programs (including private center-based 
programs, Head Start programs, Early Head Start programs, state prekindergarten programs, and family child 
care providers). Data included: 

• Self-assessment total scores that ranged from 0 to 50 based on Michigan’s Self-Assessment Survey
(n = 2,390).

• Self-assessment ratings that ranged from 1 to 5 based on Michigan’s Self-Assessment Survey with applied
quality rating and improvement system (QRIS) minimum required scores (n = 2,390).

• Independent observation of quality scores that ranged from 1 to 5 based on the Preschool Program Quality
Assessment (n = 72).

• QRIS ratings that ranged from 1 to 5 that combined information from the Self-Assessment Survey and the
Preschool Program Quality Assessment (n = 1,413).

• Simulated self-assessment ratings that ranged from 1 to 5 created by applying new cutoff scores and a
new calculation approach to the self-assessment total scores (n = 2,390). To create the simulated scores,
the study team applied the new minimum required scores under the revised approach (from version 2.0 of
Great Start to Quality) and the total score approach to the self-assessment scores under the original QRIS
approach. The simulated QRIS ratings are based only on self-assessment scores and do not incorporate
independent observation of quality scores.

The study team used a mix of analyses to answer the research questions. Descriptive statistics were used 
to document the distribution of QRIS ratings in the original systems. Pearson correlations were calculated to 
determine how self-assessment ratings and independent observations of quality scores were related. Finally, 
self-assessment ratings from the original QRIS and simulated QRIS ratings using different rating calculation 
approaches were compared to understand the implications of changes to the QRIS. 


What the study examined 


Michigan developed its QRIS, Great Start to Quality, in the early 2000s as a voluntary rating system for 
early childhood programs serving children from birth to age 5. It was rolled out statewide in 2012, a result 
of the political will and buy-in generated through Michigan’s application to the Race to the Top—Early 
Learning Challenge grant. The original rating calculation approach was based on points from the two 
instruments used to measure quality in the QRIS: the Self-Assessment Survey and the independent observation
of quality conducted by a state-trained observer.


Programs are assigned a self-assessment rating of level 1-5 (where level 1 indicates the lowest quality and
level 5 indicates the highest quality) based on their Self-Assessment Survey scores. To earn a given
self-assessment rating, programs must meet a minimum number of points in each of five domains as well as
a minimum number of overall points. Programs with a self-assessment rating of level 1, 2, or 3 receive a corresponding
QRIS rating of 1, 2, or 3. Programs with a self-assessment rating of level 4 or 5 may voluntarily
participate in the independent observation of quality to receive a final public QRIS rating of level 4 or 5.³


In January 2013, as implementation of Great Start to Quality expanded, the state decided to revise its
approach to calculating ratings, as many states across the country also were doing. The changes went into
effect in June 2013. Under the initial approach (version 1.0), the requirements for each rating were based
on the total number of points on the Self-Assessment Survey, the number of points in each domain on the
Self-Assessment Survey, and an independent observation of quality. The revised approach (version 2.0)
continues to use both the Self-Assessment Survey and the independent observation of quality but applies
different cut scores on the Self-Assessment Survey and changes the requirements for earning points on the
staff qualifications subdomain (see table C1 in appendix C in the full report for more detail). Version 2.0
still requires a minimum number of points for each domain of the Self-Assessment Survey but reduces the
number of domains in which programs must meet those minimum required points for levels 2-4 (table 1).


The state was also interested in how a total score approach would affect the number of programs at each 
QRIS rating level, so the study team developed a simple alternative total score approach that eliminated
criteria for domain scores on the Self-Assessment Survey (see table 1). The change in the calculation
system offered an opportunity to research how small changes to a rating calculation approach can shift the 
distribution of QRIS ratings across the state. The study addressed four research questions: 


1. How many programs are rated as low, moderate, and high quality under the version 1.0 approach of 
Great Start to Quality? 


2. How consistent are the self-assessment ratings and independent observations of quality under the 
version 1.0 approach of Great Start to Quality? 


3. How are domain scores related to overall self-assessment ratings under the version 1.0 approach of 
Great Start to Quality? 


4. How do the distributions of self-assessment and final ratings under Great Start to Quality change with 
alternative approaches to calculating ratings? 


Research questions 1-3 provide preliminary data on the landscape of early childhood program quality, 
based on ratings under the version 1.0 approach of Great Start to Quality. Research question 4 informs 
key stakeholders in Michigan and other states about how small changes to the rating calculation approach 
(such as the version 2.0 and total score approach under Great Start to Quality) can influence ratings. 


Table 1. Domain score requirements differ in the three rating calculation approaches examined in
the study


Self-assessment   Original Great Start to        Revised Great Start to         Total score
rating            Quality approach               Quality approach               approach

Level 1           Programs must be licensed.     Programs must be licensed.     Programs must be licensed.

Level 2           Minimum total points: 16       Minimum total points: 16       Minimum total points: 16
                  Programs must meet minimum     Programs must meet minimum     No domain score requirements.
                  required scores in all five    required scores in any two
                  domains.                       domains.

Level 3           Minimum total points: 26       Minimum total points: 26       Minimum total points: 26
                  Programs must meet minimum     Programs must meet minimum     No domain score requirements.
                  required scores in all five    required scores in any three
                  domains.                       domains.

Level 4           Minimum total points: 38       Minimum total points: 38       Minimum total points: 38
                  Programs must meet minimum     Programs must meet minimum     No domain score requirements.
                  required scores in all five    required scores in any four
                  domains.                       domains.

Level 5           Minimum total points: 42       Minimum total points: 42       Minimum total points: 42
                  Programs must meet minimum     Programs must meet minimum     No domain score requirements.
                  required scores in all five    required scores in all five
                  domains.                       domains.

Note: Data are as of January 16, 2013. Programs must be licensed to participate in Michigan’s Great Start to Quality, and all 
licensed programs enter the QRIS at level 1. 


Source: Original and revised Great Start to Quality, materials shared by the Michigan Department of Education’s Office of Great Start;
total score approach, developed by the authors in partnership with the Michigan Department of Education’s Office of Great Start. 
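
The rules in table 1 can be expressed as a short calculation. The Python sketch below is illustrative only and is not the state's implementation. The minimum total points and the number of domains in which a minimum must be met come from table 1. The per-domain minimum scores (DOMAIN_MIN), the use of one number per domain, the example programs, and all function names are hypothetical, because this brief does not list the domain minimums (the full report gives more detail).

```python
from collections import Counter

# Minimum total points per level (from table 1).
MIN_TOTAL = {2: 16, 3: 26, 4: 38, 5: 42}

# Number of domains (out of five) in which the minimum required score
# must be met at each level, by calculation approach (from table 1).
DOMAINS_REQUIRED = {
    "original": {2: 5, 3: 5, 4: 5, 5: 5},
    "revised": {2: 2, 3: 3, 4: 4, 5: 5},
    "total_score": {2: 0, 3: 0, 4: 0, 5: 0},
}

# HYPOTHETICAL minimum score per domain at each level; the brief does not
# report the actual values, so these are placeholders for illustration.
DOMAIN_MIN = {2: 2, 3: 4, 4: 6, 5: 8}


def self_assessment_rating(domain_scores, approach):
    """Return the highest level (1-5) whose criteria a licensed program meets."""
    total = sum(domain_scores)
    rating = 1  # all licensed programs enter the QRIS at level 1
    for level in (2, 3, 4, 5):
        domains_met = sum(score >= DOMAIN_MIN[level] for score in domain_scores)
        if (total >= MIN_TOTAL[level]
                and domains_met >= DOMAINS_REQUIRED[approach][level]):
            rating = level
    return rating


def rating_distribution(programs, approach):
    """Count programs at each level 1-5 under the given approach."""
    counts = Counter(self_assessment_rating(p, approach) for p in programs)
    return {level: counts.get(level, 0) for level in range(1, 6)}


# Two made-up programs, each with five domain scores.
weak_in_one_domain = [10, 10, 9, 8, 1]   # total 38, one very low domain
just_short_in_one = [9, 9, 9, 9, 7]      # total 43, one domain below 8

for approach in ("original", "revised", "total_score"):
    print(approach,
          self_assessment_rating(weak_in_one_domain, approach),
          self_assessment_rating(just_short_in_one, approach))
# original 1 4
# revised 4 4
# total_score 4 5
```

Under these assumed domain minimums, the first made-up program is held at level 1 by the original approach but reaches level 4 once the domain requirements are relaxed, and the second moves from level 4 to level 5 only under the total score approach. Both patterns mirror the kinds of shifts the study reports, but the specific numbers here are invented.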


What the study found 


This section details the main findings of the study. 


Programs had higher simulated ratings under the revised and total score rating calculation approaches than under the 
original approach 


Under Great Start to Quality’s original rating calculation approach, most programs had a self-assessment
rating at the lowest (1) or highest (5) level (figure 1). Under the revised approach, fewer programs
had a simulated QRIS rating of level 1 and more programs had a rating of level 2, 3, or 4. The number
of programs at level 5 did not change substantially because the criteria for the highest rating were
essentially the same under both approaches. The number of programs rated at level 1 dropped considerably,
as those programs attained higher ratings due to reduced domain requirements in the revised
approach. The median rating was level 3 under the original approach and level 4 under the revised
approach. These findings suggest that the revised approach makes it easier for programs to receive a
higher QRIS rating.


The distribution of simulated QRIS ratings was almost identical under the total score approach and the revised 
approach 


If the state used only the self-assessment total scores and removed all domain score requirements, the 
resulting self-assessment ratings would be very similar to those under the state’s revised approach. In fact, 
the distribution of ratings under the total score approach and the revised approach was almost identical
(see figure 1). The only difference is that 19 programs would move from level 4 to level 5 under the
total score approach. Given this finding, Michigan could achieve a similar distribution of QRIS ratings by 
either reducing the number of domains that have minimum required scores or removing all domain score 
requirements. 


Figure 1. Self-assessment ratings were higher under the revised approach to calculating ratings 
than under the original approach of Michigan’s Great Start to Quality and similar to those under 
the total score approach 


[Figure: bar chart of the number of programs at each self-assessment rating level (level 1 through level 5) under the original approach, the revised approach, and the total score approach.]


Note: Of the 3,941 programs that participated in Michigan’s Great Start to Quality, 2,390 (60.6 percent) completed the Self-Assessment Survey. Data are as of January 16, 2013.


Source: Authors’ calculations based on data provided by the Michigan Department of Education’s Office of Great Start. 


For the 72 programs that completed both the Self-Assessment Survey and an independent observation of quality, self- 
assessment ratings were higher than independent observation scores 60 percent of the time 


The study also examined how consistent the self-assessment ratings and independent observations of quality 
scores were under the original rating calculation approach of Great Start to Quality. For the 72 programs 
that completed both the Self-Assessment Survey and an independent observation of quality, self-assessment
ratings were higher 60 percent of the time. Self-assessment total scores and independent observation
of quality scores were not significantly associated (Spearman’s ρ = .188, p = .114), which suggests that the
definition of quality in the two instruments may not be the same. Further research is needed to test the 
relationship between self-assessment ratings and independent observation of quality scores because this 
study sample had only 72 programs with both types of data. Now that the state has hundreds of programs 
rated with the independent observation of quality, the analyses could be replicated with the larger sample 
to see whether the lack of association persists. 
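
For readers who want to replicate this comparison on a larger sample, the following Python sketch shows one way to compute the two summary statistics discussed above: the share of programs whose self-assessment rating exceeds the observation score, and Spearman's rank correlation. It is a minimal illustration, not the study team's code. The function names and the five example values are made up, and the sketch does not compute the p-value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def share_self_higher(self_ratings, observed_scores):
    """Proportion of programs whose self-assessment rating exceeds the observed score."""
    higher = sum(s > o for s, o in zip(self_ratings, observed_scores))
    return higher / len(self_ratings)


# Made-up ratings for five programs (NOT study data).
self_ratings = [5, 4, 5, 4, 5]
observed_scores = [4, 4, 3, 5, 4]
print(share_self_higher(self_ratings, observed_scores))  # 0.6
print(spearman(self_ratings, observed_scores))
```

Ranking before correlating makes the statistic depend only on the ordering of programs, which suits ratings on a 1-5 scale where many programs share the same value.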


Implications of the study findings 


This section discusses the implications of the study findings for Michigan and for other states.


For Michigan


The findings from this study are directly relevant to the efforts of Michigan policymakers and Great Start 
to Quality administrators to monitor and improve the quality of early childhood education and care 
programs. 


Both the revised and total score approaches for self-assessment ratings make it easier for programs to 
achieve a middle quality rating than the original approach does. Under the original rating calculation 
approach, programs were required to meet standards in five domains at each rating level. Under the revised 
approach, the requirements have been relaxed considerably, especially at the lower rating levels. The rating 
criteria for levels 4 and 5 under both approaches still require high scores on most or all domains of the 
Self-Assessment Survey (by design in the revised approach and by default in the total score approach),
and programs at those levels are also required to demonstrate high process quality on the independent 
observation of quality. In contrast, the domain score requirements for middle quality ratings (levels 2 and 
3) have been reduced considerably under both alternative approaches, so programs with lower ratings in 
either alternative approach could have very low scores in some domains of quality but still receive a middle 
quality rating. Under the original approach programs could not receive a middle quality rating without 
achieving this moderate level of quality in all domains. 


The new approaches also allow programs more flexibility to target particular aspects of quality to 
improve their rating. Under the revised and total score approaches, programs can target a subset of quality 
domains to improve their rating rather than increasing quality in all five domains, and programs have the 
flexibility to choose the specific domains to focus on. This could be seen as a big advantage to programs as 
they create their own quality improvement plans. However, one limitation of the total score approach is 
that it does not include as much diagnostic information for program administrators to know where to target 
their improvement strategies. 


Relaxing the domain requirements could also change the programs rated at levels 2 and 3. Some programs 
may be excellent in staff qualifications but low in curriculum and vice versa. The change in the definition 
of quality under the revised approach and total score approach allows for multiple pathways to high self-assessment
ratings as opposed to the singular pathway under the original approach. Although this may be
a benefit to programs for planning purposes, the changes could make it more difficult for families to use the
ratings and to understand which programs are the best fit based on their needs. Michigan will need to consider
the tradeoffs of using an approach that reduces or eliminates domain score requirements and to decide
which is more important: the benefit of using a simple calculation approach or the benefit of providing
child care consumers as well as programs themselves with more detailed information on program quality.


Both the revised and total score approaches entail additional costs because more programs are eligible 
for independent observations of quality (by reaching level 4 or 5), but using an alternate pathway to 
rate public prekindergarten or Head Start programs may be one way to offset the costs. Both the revised 
and total score approaches resulted in more programs in the study sample receiving a level 4 self-assessment 
rating, meaning that more programs will be eligible for costly independent observations of quality. In fact, 
the study team’s simulations suggest that more than 100 additional programs would be eligible for the independent
observation of quality using the revised or total score approach. Administering the independent
observation may require additional resources such as the labor costs of supporting independent observers.
Michigan initially faced delays in completing observations for all programs eligible under the original
approach, and increasing the number of programs eligible for the independent observation of quality could 
exacerbate these challenges if states are unable to offset costs with alternate pathways to a high rating. 


In 2014 Michigan implemented an alternate pathway to high QRIS ratings for some programs, including 
the state’s Great Start to Readiness and Head Start programs and programs accredited by the National 
Association for the Education of Young Children or the National Association for Family Child Care. These 
programs account for about a third of programs that have been rated. Because accreditation requirements 
and program standards for Great Start to Readiness and Head Start are comparable to requirements for 
higher ratings under Great Start to Quality, the state exempted this subset of programs from the independent
observation of quality, so only self-assessment ratings are considered for these programs to receive a
level 4 rating (programs still have to participate in the independent observation of quality to receive a level 
5 rating). This reduces some of the burden of conducting independent observations of quality. 


For other states 


This study finds that small changes to the way QRIS ratings are calculated can lead to substantial changes 
in the distribution of ratings. That finding is consistent with other research that suggests that the distribution
of ratings depends heavily on the calculation approach (Tout et al., 2014). When modifying or recalibrating
a QRIS, states should carefully consider the calculation approach and the ways that even small
differences in rating criteria can affect both the overall distribution of ratings and the ratings of individual 
programs. 


Changing QRIS rating systems can pose challenges, but such challenges appear manageable. Nationwide,
QRIS development and refinement are ongoing, and systems are continually evolving. All seven
states have made at least one change to their system (such as expanding eligibility or adjusting requirements)
since inception. Such changes require policymakers, providers, and families to keep up to date with
the changes in requirements, which can be challenging. However, Michigan’s experience suggests that a 
state can recalibrate parts of its QRIS and programs will still participate in it. In fact, many states consider 
their QRIS part of a continuous quality improvement plan where minor adjustments and updates are a 
normal part of implementation. 


States may need to consider whether to rely more heavily on self-assessment ratings or independent
observations of quality. This study found that self-assessment ratings tended to be higher than independent
observations of quality scores for the 72 programs with both types of ratings and that no significant
correlation existed between self-assessment ratings and independent observation of quality scores. This difference
may be due in part to differences in the aspects of quality measured in the two instruments. Great
Start to Quality was designed to ensure that the highest ratings go to programs that have a strong structural
foundation and provide high-quality interactions and instruction. The Self-Assessment Survey measures
mostly structural quality (program characteristics such as adult-child ratio and staff qualifications), while
the independent observation of quality measures mostly process quality (observed quality of the interactions
between adults and children in the classroom). Although structural quality tends to predict process quality
(Burchinal, Cryer, Clifford, & Howes, 2002; National Institute of Child Health and Human Development
Early Child Care Research Network, 2002; Pianta et al., 2005), the relationship is not perfect, and structural
quality ratings should not be expected to align precisely with process quality ratings. For this reason, self-assessment
ratings and independent observation of quality scores are likely to differ somewhat, which might
reflect the added value of assessing process quality rather than poor alignment between the instruments.


The findings also suggest that the self-assessment ratings and independent observation of quality scores 
may provide unique information about a program’s quality, although the number of programs included 
in the analysis was small and not representative of all programs in the state that would be eligible for 
observations. Another potential explanation for the differences in the two types of ratings is that the self-assessment
rating is completed by program staff, who are likely to provide a favorable rating of their own
program, while the independent observation of quality is completed by an independent observer who may 
provide a more neutral rating. Additional research is needed to assess the relationship between the two 
aspects of quality measured in Michigan’s QRIS. Other states should consider the pros and cons of both 
aspects of program quality when designing their QRIS. 


Limitations of the study 


The data available as of January 2013 were limited because Michigan’s QRIS was fairly new. The state had 
completed just 72 independent observations of quality, even though 1,049 programs were eligible to receive 
an observation based on their self-assessment ratings. Furthermore, programs with independent observation
of quality scores were not randomly sampled and were disproportionately representative of certain
locations in the state (see appendix E in the full report; Faria, Hawkinson, Greenberg, Howard, & Brown, 
2015). Given these limitations, caution is advised when interpreting findings based on analyses that used 
independent observations of quality scores. 


In addition, self-assessment ratings are based on self-report for 80 percent of programs and state-verified 
ratings for 20 percent of programs. This raises concerns about the reliability of self-assessment rating data
because different modes of data collection were used for the two types of ratings. For more than 40 percent 
of programs with both types of ratings, the self-assessment ratings differed from the state-verified ratings. 
This raises the possibility of bias in the self-reported data. 


This study simulated self-assessment ratings under the revised and total score approaches, using data reported 
by programs under the original rating approach. Because programs might have self-rated differently if they had 
access to the revised calculation approach, the simulations may have some unknown and unsystematic bias. 


Finally, because Great Start to Quality is still voluntary—albeit with incentives to participate—the sample 
of programs and providers included in the analyses is not representative of all early childhood education 
programs in Michigan, only those rated in the QRIS. Differences between rated and nonrated programs 
may exist; however, because of the limited data on programs that are not rated, extensive comparisons of 
quality were not possible. Examining basic program characteristics such as license type and total enrollment
revealed some differences between participating and nonparticipating programs.


Notes


1. Missouri requires legislative action to implement a QRIS.

2. These numbers were accurate as of 2010, when the most recent compendium of QRISs was released
(Tout et al., 2010). Although these data are now six years old, no other comprehensive document
describes the domains measured in all 49 QRISs.

3. To begin the QRIS rating process, licensed programs complete the Self-Assessment Survey and are
assigned an overall self-assessment rating of 1-5 based on their preliminary score and minimum point
requirements for domain scores. At this stage programs with a self-assessment rating of 1, 2, or 3 have
completed the steps required for a final, public rating. Programs with a self-assessment rating of 4 or 5
may voluntarily participate in the independent observation of quality and need to receive a minimum
score to receive a QRIS score of level 4 or 5. If a program has a self-assessment rating at level 4 or 5, but
has not yet participated in the independent observation of quality, it does not have a final public QRIS
score, and no information is reported until the independent observation is completed. Once the independent
observation is completed, the minimum required points on form A of the Program Quality
Assessment determine a program’s final public QRIS score of level 3, 4, or 5. (See appendix A in the
main report for more detail about how program ratings are assigned in Michigan’s QRIS.)


The National Center for Education Evaluation and Regional Assistance (NCEE) conducts unbiased 
large-scale evaluations of education programs and practices supported by federal funds; provides 
research-based technical assistance to educators and policymakers; and supports the synthesis and 
the widespread dissemination of the results of research and evaluation throughout the United States. 